Gong, Zhenhuan. Multi-level Data Layout Optimization for Heterogeneous Access Patterns. (Under the direction of Dr. Nagiza F. Samatova.)
Author: Zhenhuan Gong
Abstract
Recent years have seen an enormous increase in the computational power of leadership computing facilities. As a result, scientific applications running on supercomputers produce huge amounts of data, from terascale to petascale. I/O subsystems, however, have not developed at a comparable pace, making data I/O and storage the major bottleneck in modern computing architectures. The problem is exacerbated by the need to run data-intensive analytic jobs, such as queries with multiple constraints, on datasets held in external storage. Scientific applications produce multi-dimensional, multi-variate, double-precision datasets, which are usually stored on large-scale parallel file systems and are not well represented by traditional relational data models. Queries on scientific datasets involve multiple constraints and therefore produce heterogeneous I/O access patterns. How extreme-scale datasets are linearized and organized on parallel file systems is crucial to read performance for queries: an optimized layout results in more sequential reads of contiguous data blocks, which are much faster than seek-and-read operations on small non-contiguous blocks. Existing data layout optimization techniques, while successfully improving read performance for certain application-specific access patterns, fail to address more general and heterogeneous access patterns. They also often do not scale to the substantial growth in data size expected as datasets approach exascale in the near future. Moreover, existing technology usually performs data layout optimization as a post-processing step on datasets already on storage systems.
Post-processing approaches read the entire dataset from storage, perform layout optimization, and write the processed data back to storage, which is extremely inefficient given the I/O bottleneck in modern computer architectures and the huge size of the datasets. There is no general framework that performs data layout optimization for scientific datasets at simulation run time or I/O time, before the datasets are written to storage, in order to reduce I/O and storage overhead and speed up the entire process. To address these problems, a multi-level data storage layout scheme is presented that optimizes for the heterogeneous access patterns induced by different types of queries in scientific data analysis. First, a hybrid layout scheme is presented that is optimized for two common access patterns: value-constrained accesses and space-constrained accesses. The layout scheme improves data locality and substantially reduces latency-bound I/O operations (such as seeks). This layout scheme is further generalized by MLOC, a parallel Multilevel Layout Optimization framework for Compressed scientific spatio-temporal data at extreme scale. MLOC includes multiple fine-grained data layout optimization kernels that form a generic core, from which a broader constellation of such kernels can be consolidated in a hierarchical multilevel architecture to enable effective data exploration with various combinations of access patterns. Specifically, the kernels are optimized for access patterns induced by (a) query-driven multi-variate, spatio-temporal constraints, (b) precision-driven data analytics, (c) compression-driven data reduction, (d) multi-resolution data sampling, and (e) multi-file data partitioning and organization on parallel file systems. When tested on query-driven exploration of compressed data, MLOC demonstrates an order of magnitude faster query response time than state-of-the-art scientific database management technology.
Based on MLOC, a parallel run-time data layout optimization framework, PARLO, is presented to perform data layout optimization on scientific datasets at simulation run time. By processing the datasets while they are still in memory, before they are written to disk, PARLO removes the additional I/O and storage overhead and significantly reduces overall processing time compared with traditional post-processing approaches. PARLO is integrated with ADIOS, a popular parallel I/O middleware. With the support of ADIOS's I/O libraries and its portable BP file format, PARLO achieves high-performance, parallel, and user-transparent data layout optimization at run time.
© Copyright 2013 by Zhenhuan Gong
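The abstract's claim that linearization determines how many sequential reads a query needs can be illustrated with a small sketch. It is not taken from the thesis and does not reproduce the hybrid layout, MLOC, or PARLO; the function names (`contiguous_runs`, `row_major`, `col_major`, `query_offsets`) and the 1000 x 1000 array are assumptions chosen only for illustration. The sketch counts how many contiguous runs of file offsets a space-constrained query touches under two simple linearizations, with each run standing in for one seek followed by one sequential read.

```python
def contiguous_runs(offsets):
    """Count maximal runs of consecutive offsets.

    Each run models one seek followed by one sequential read.
    """
    ordered = sorted(offsets)
    if not ordered:
        return 0
    runs = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


def row_major(i, j, n_rows, n_cols):
    """Offset of element (i, j) when rows are stored one after another."""
    return i * n_cols + j


def col_major(i, j, n_rows, n_cols):
    """Offset of element (i, j) when columns are stored one after another."""
    return j * n_rows + i


def query_offsets(linearize, rows, cols, n_rows, n_cols):
    """Offsets touched by a rectangular (space-constrained) query."""
    return [linearize(i, j, n_rows, n_cols) for i in rows for j in cols]


# Hypothetical 1000 x 1000 array; the query selects 4 full rows.
N_ROWS, N_COLS = 1000, 1000
rows, cols = range(10, 14), range(N_COLS)

row_runs = contiguous_runs(query_offsets(row_major, rows, cols, N_ROWS, N_COLS))
col_runs = contiguous_runs(query_offsets(col_major, rows, cols, N_ROWS, N_COLS))

print("row-major runs:", row_runs)
print("column-major runs:", col_runs)
```

Under these assumptions the same query maps to a single contiguous run in the row-major layout but to one short run per column in the column-major layout, which is the kind of gap between sequential and seek-dominated access that the abstract describes.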
Similar resources
Parallel Data Layout Optimization of Scientific Data through Access-driven Replication
Efficient I/O on large-scale spatio-temporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these ...
RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication
Efficient I/O on large-scale spatiotemporal scientific data requires scrutiny of both the logical layout of the data (e.g., row-major vs. column-major) and the physical layout (e.g., distribution on parallel filesystems). For increasingly complex datasets, hand optimization is a difficult matter prone to error and not scalable to the increasing heterogeneity of analysis workloads. Given these f...
Maximum Maintainability of Complex Systems via Modulation Based on DSM and Module Layout. Case Study: Laser Range Finder
The present paper aims to investigate the effects of modularity and the layout of subsystems and parts of a complex system on its maintainability. For this purpose, four objective functions have been considered simultaneously: I) maximizing the level of accordance between system design and optimum modularity design, II) maximizing the level of accessibility and the maintenance space required, III...
Data layout optimization for multi-valued containers in OpenCL
Scientific data is mostly multi-valued, e.g., coordinates, velocities, moments or feature components, and it comes in large quantities. The data layout of such containers has an enormous impact on the achieved performance, however, layout optimization is very time-consuming and error-prone because container access syntax in standard programming languages is not sufficiently abstract. This means...
A Compile-Time Data Locality Optimization Framework for NUCA Chip Multiprocessors
With increasing numbers of cores, future CMPs (Chip MultiProcessors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. For data-parallel programming models, there is a mismatch between such a non-uniform cache organization and the canonical row-major or column-major layouts of multi-dimensional arrays...
Publication date: 2013